An autoregressive-moving average model in which all roots of the
autoregressive polynomial are reciprocals of roots of the moving average
polynomial and vice versa is called an all-pass time series model.
All-pass models are useful for identifying and modeling noncausal and 
noninvertible autoregressive-moving average processes. We establish 
asymptotic normality and consistency for rank-based estimators of 
all-pass model parameters. The estimators are obtained by minimiz- 
ing the rank-based residual dispersion function given by Jaeckel [Ann. 
Math. Statist. 43 (1972) 1449-1458]. These estimators can have the 
same asymptotic efficiency as maximum likelihood estimators and are 
robust. The behavior of the estimators for finite samples is studied 
via simulation and rank estimation is used in the deconvolution of a 
simulated water gun seismogram. 

1. Introduction. Autoregressive-moving average (ARMA) models, the 
standard linear time series models for stationary data, are often fit to ob- 
served series using Gaussian likelihood, least-squares, or related second-order 
moment estimation techniques. These are effective methods for finding fitted 
ARMA models with second-order moment properties that resemble those of 
an observed series, whether or not the data are Gaussian. However, because 
every Gaussian ARMA process has a causal, invertible ARMA representa- 
tion (all roots of the autoregressive and moving average polynomials are 
outside the unit circle), in the non-Gaussian case, the second-order methods 
are unable to identify a noncausal (at least one root of the autoregressive 
polynomial is inside the unit circle) or noninvertible (at least one root of
the moving average polynomial is inside the unit circle) ARMA series. Fitted
ARMA models obtained using second-order techniques may not, therefore,
most effectively capture the higher-order moment structure of the data.
Consequently, an effort to identify noncausal and noninvertible series should 
be part of any ARMA fitting procedure. In this paper, we discuss all-pass 
models, which are useful tools for identifying and modeling noncausal and 
noninvertible ARMA processes. 

All-pass models are ARMA models in which the roots of the autoregres- 
sive polynomial are reciprocals of roots of the moving average polynomial 
and vice versa. These models generate uncorrelated (white noise) time series 
that are not independent in the non-Gaussian case. As discussed in [2], an 
all-pass series can be obtained by fitting a causal, invertible ARMA model 
to a series generated by a causal, noninvertible ARMA model. The residu- 
als follow an all-pass model of order r, where r is the number of roots of 
the true moving average polynomial inside the unit circle. Consequently, by 
identifying the all-pass order of the residuals, the order of noninvertibility 
of the ARMA model can be determined without considering all possible 
configurations of roots inside and outside the unit circle, which is compu- 
tationally prohibitive for large-order models. Noninvertible ARMA models 
have appeared, for example, in vocal tract filters [8, 9], in the analysis of 
unemployment rates [13] and in seismogram deconvolution [2, 19]. All-pass 
models can be used similarly to fit noncausal ARMA models [6]. See [6] for 
a list of applications for noncausal models. 

Estimation methods based on second-order moment techniques cannot 
identify all-pass models because Gaussian all-pass series are independent. 
Thus, cumulant-based estimators, using cumulants of order greater than 
two, are often used to estimate these models [8, 9, 11]. Breidt, Davis and 
Trindade [6] consider a least absolute deviations (LAD) estimation approach 
which is motivated by the likelihood of an all-pass model with Laplace (two- 
sided exponential) noise, and Andrews, Davis and Breidt [2] consider a max- 
imum likelihood (ML) estimation approach. The LAD and ML estimators 
are consistent and asymptotically normal. However, the LAD estimation 
procedure is limited by the assumption that the mean and median for the 
noise are equivalent, and the ML procedure is limited by the assumption 
that the probability density function for the noise is symmetric and known 
to within some parameter values. 

In this paper, we consider a rank-based estimation technique first pro- 
posed by Jaeckel [14] for estimating linear regression parameters. Jaeckel's 
estimator minimizes the sum of model residuals weighted by a function of 
residual rank. We study the asymptotic properties of Jaeckel's rank (R) 
estimator in the case of all-pass parameter estimation. This R-estimator is
more robust than the LAD and ML estimators; it is consistent and asymptotically
normal under less stringent conditions. In addition, when R-estimation



RANK ESTIMATION FOR ALL-PASS MODELS 






is used in lieu of LAD or ML, efficiency need not be sacrificed. There exists
a weight function for which R-estimation is asymptotically equivalent
to LAD estimation and, when the noise distribution is known, the weight
function can be chosen so that R-estimation is asymptotically equivalent to
ML estimation. We also find that when the Wilcoxon weight function (a
linear weight function) is used, R-estimation is (relatively) very efficient for
a large class of noise distributions. Another advantage of R-estimation is
that one has the flexibility to choose a weight function that tends to produce
relatively smooth R-objective functions which can be minimized fairly
easily.

Because the objective function for Jaeckel's R-estimation method involves
not only the residual ranks, but also the residual values, this is not pure
R-estimation. Koul and Ossiander [16], Koul and Saleh [17], Mukherjee
and Bai [20] and Terpstra, McKean and Naranjo [23] consider related rank-based
estimation approaches for autoregressive model parameters. Also, Allal,
Kaaouachi and Paindaveine [1] examine a pure R-estimator for ARMA
model parameters based on correlations of weighted residual ranks. The results
for this pure R-estimator are not applicable to all-pass model parameters
because the parameters in the autoregressive polynomial of an all-pass
model are functions of parameters in the moving average polynomial and
vice versa.

In Section 2 we consider Jaeckel's R-function in the context of all-pass
parameter estimation. Asymptotic normality for R-estimators is established
under mild conditions and order selection is discussed in Section 3. Proofs
of the lemmas used to establish the results of Section 3 can be found in the
Appendix. We study the behavior of the estimators for finite samples via
simulation in Section 4.1 and use R-estimation in the deconvolution of a
simulated water gun seismogram in Section 4.2.

2. Preliminaries. 

2.1. All-pass models. Let $B$ denote the backshift operator ($B^k X_t = X_{t-k}$,
$k = 0, \pm 1, \pm 2, \ldots$) and let $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ be a $p$th order
autoregressive polynomial, where $\phi(z) \neq 0$ for $|z| = 1$. The filter $\phi(B)$ is said
to be causal if all the roots of $\phi(z)$ are outside the unit circle in the complex
plane. In this case, for a sequence $\{W_t\}$,
$\phi^{-1}(B) W_t = (\sum_{j=0}^{\infty} \psi_j B^j) W_t = \sum_{j=0}^{\infty} \psi_j W_{t-j}$,
a function of only the past and present $\{W_t\}$. If $\phi(B)$ is causal,
then the filter $B^p \phi(B^{-1})$ is purely noncausal and hence
$B^{-p} \phi^{-1}(B^{-1}) W_t = (\sum_{j=0}^{\infty} \psi_j B^{-p-j}) W_t = \sum_{j=0}^{\infty} \psi_j W_{t+p+j}$,
a function of only the present and future $\{W_t\}$. See, for example, Chapter 3 of [7].

Let $\phi_0(z) = 1 - \phi_{01} z - \cdots - \phi_{0p} z^p$, where $\phi_0(z) \neq 0$ for $|z| \le 1$. Define
$\phi_{00} = 1$ and $r = \max\{0 \le j \le p : \phi_{0j} \neq 0\}$. Then a causal all-pass time series
is the ARMA series $\{X_t\}$ which satisfies the difference equations

(2.1)  $$\phi_0(B) X_t = \frac{B^r \phi_0(B^{-1})}{-\phi_{0r}} Z_t^*$$

or

$$X_t - \phi_{01} X_{t-1} - \cdots - \phi_{0r} X_{t-r} = Z_t^* + \frac{\phi_{0,r-1}}{\phi_{0r}} Z_{t-1}^* + \cdots + \frac{\phi_{01}}{\phi_{0r}} Z_{t-r+1}^* - \frac{1}{\phi_{0r}} Z_{t-r}^*,$$

where the series $\{Z_t^*\}$ is an independent and identically distributed (i.i.d.)
sequence of random variables with mean 0, variance $\sigma^2 \in (0, \infty)$ and distribution
function $F$. The true order of the all-pass model is $r$ ($0 \le r \le p$).
Observe that the roots of the autoregressive polynomial $\phi_0(z)$ are reciprocals
of the roots of the moving average polynomial $-\phi_{0r}^{-1} z^r \phi_0(z^{-1})$ and vice
versa.

The spectral density for $\{X_t\}$ in (2.1) is

$$\frac{|e^{-ir\omega}|^2\, |\phi_0(e^{i\omega})|^2}{\phi_{0r}^2\, |\phi_0(e^{-i\omega})|^2}\, \frac{\sigma^2}{2\pi} = \frac{\sigma^2}{\phi_{0r}^2\, 2\pi},$$

which is constant for $\omega \in [-\pi, \pi]$. Thus, $\{X_t\}$ is an uncorrelated sequence.
In the case of Gaussian $\{Z_t^*\}$, this implies that $\{X_t\}$ is i.i.d. $N(0, \sigma^2 \phi_{0r}^{-2})$,
but independence does not hold in the non-Gaussian case if $r \ge 1$ (see [5]).
The model (2.1) is called all-pass because the power transfer function of the
all-pass filter passes all the power for every frequency in the spectrum. In
other words, an all-pass filter does not change the distribution of power over
the spectrum.

We can express (2.1) as

(2.2)  $$\phi_0(B) X_t = \frac{B^p \phi_0(B^{-1})}{-\phi_{0r}} Z_t,$$

where $\{Z_t\} = \{Z^*_{t+p-r}\}$ is an i.i.d. sequence of random variables with mean
0, variance $\sigma^2$ and distribution function $F$. Rearranging (2.2) and setting
$z_t = \phi_{0r}^{-1} Z_t$, we have the backward recursion
$z_{t-p} = \phi_{01} z_{t-p+1} + \cdots + \phi_{0p} z_t - (X_t - \phi_{01} X_{t-1} - \cdots - \phi_{0p} X_{t-p})$.
An analogous recursion for an arbitrary, causal autoregressive polynomial
$\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ can be defined as

(2.3)  $$z_{t-p}(\phi) = \begin{cases} 0, & t = n+p, \ldots, n+1, \\ \phi_1 z_{t-p+1}(\phi) + \cdots + \phi_p z_t(\phi) - \phi(B) X_t, & t = n, \ldots, p+1, \end{cases}$$

where $\phi := (\phi_1, \ldots, \phi_p)'$. Let $\phi_0 = (\phi_{01}, \ldots, \phi_{0p})' = (\phi_{01}, \ldots, \phi_{0r}, 0, \ldots, 0)'$
denote the true parameter vector and note that $\{z_t(\phi_0)\}_{t=1}^{n-p}$ closely approximates
$\{z_t\}_{t=1}^{n-p}$; error is due to the initialization with zeros. Although
$\{z_t\}$ is i.i.d., $\{z_t(\phi_0)\}_{t=1}^{n-p}$ is not i.i.d. if $r \ge 1$.
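
The backward recursion (2.3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `allpass_residuals` and the list-based interface are assumptions made here, not part of the paper. The input `x` holds $X_1, \ldots, X_n$ and `phi` holds $(\phi_1, \ldots, \phi_p)$.

```python
def allpass_residuals(x, phi):
    """Backward recursion (2.3): return [z_1(phi), ..., z_{n-p}(phi)].

    Hypothetical helper for illustration; x = [X_1, ..., X_n], phi = [phi_1, ..., phi_p].
    """
    n, p = len(x), len(phi)
    # z[t] stores z_t(phi) for t = 1..n (index 0 unused).
    # The last p values z_{n-p+1}, ..., z_n are initialized to zero.
    z = [0.0] * (n + 1)
    for t in range(n, p, -1):  # t = n, n-1, ..., p+1
        # phi(B) X_t = X_t - phi_1 X_{t-1} - ... - phi_p X_{t-p}
        ar_part = x[t - 1] - sum(phi[j - 1] * x[t - 1 - j] for j in range(1, p + 1))
        # phi_1 z_{t-p+1}(phi) + ... + phi_p z_t(phi)
        z_part = sum(phi[j - 1] * z[t - p + j] for j in range(1, p + 1))
        z[t - p] = z_part - ar_part
    return z[1 : n - p + 1]
```

For example, with `x = [1.0, 2.0, 3.0]` and `phi = [0.5]` the recursion starts from $z_3 = 0$ and yields $z_2 = -2$ and then $z_1 = -2.5$.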
2.2. Jaeckel's rank function. Suppose we have a realization of length $n$,
$\{X_t\}_{t=1}^{n}$, from (2.1). Let $\lambda$ be a function from (0,1) to $\mathbb{R}$ such that

A1. $\lambda$ is strictly increasing and $\lambda(s) = -\lambda(1 - s)$ for all $s \in (0, 1)$.

If $\phi$ forms a causal $p$th order autoregressive polynomial and $\{R_t(\phi)\}_{t=1}^{n-p}$
contains the ranks of $\{z_t(\phi)\}_{t=1}^{n-p}$ from (2.3), then the R-function evaluated
at $\phi$ with weight function $\lambda$ is

(2.4)  $$D(\phi) := \sum_{t=1}^{n-p} \lambda\!\left(\frac{R_t(\phi)}{n-p+1}\right) z_t(\phi).$$

Because it tends to be near zero when the elements of $\{z_t(\phi)\}$ are similar,
(2.4) is a measure of the dispersion of the residuals $\{z_t(\phi)\}$. One
choice for the weight function is $\lambda(s) = s - 1/2$. In this case, the weights
$\{\lambda(t/(n-p+1))\}_{t=1}^{n-p}$ are known as Wilcoxon scores.
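
A rank-weighted dispersion of this kind is straightforward to evaluate once the residuals are available. The sketch below assumes the residuals are passed in as a plain list of length $m = n - p$; the names `wilcoxon` and `dispersion` are hypothetical and chosen here for illustration only.

```python
def wilcoxon(s):
    """Wilcoxon weight function lambda(s) = s - 1/2."""
    return s - 0.5


def dispersion(z, lam=wilcoxon):
    """Sum of lam(rank / (m + 1)) * z over the m residuals in z (m = n - p)."""
    m = len(z)
    # Indices of z ordered from the smallest to the largest residual.
    order = sorted(range(m), key=lambda i: z[i])
    ranks = [0] * m
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Ties receive arbitrary adjacent ranks; tied residuals are equal,
    # so the value of the sum does not depend on how ties are broken.
    return sum(lam(ranks[i] / (m + 1)) * z[i] for i in range(m))
```

With Wilcoxon weights, `dispersion([3.0, 1.0, 2.0])` pairs the weights $-1/4, 0, 1/4$ with the ordered residuals $1, 2, 3$, and a constant residual sequence gives a dispersion of zero.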

We give some properties for $D$ in the following theorem. Jaeckel [14] shows
that the same properties hold for the R-function in the linear regression case.

Theorem 2.1. Assume A1 holds. For any $\phi \in \mathbb{R}^p$, if

$$\{P_1(\phi), \ldots, P_{(n-p)!}(\phi)\} = \{\{z_{1,1}(\phi), \ldots, z_{1,n-p}(\phi)\}, \ldots, \{z_{(n-p)!,1}(\phi), \ldots, z_{(n-p)!,n-p}(\phi)\}\}$$

contains the $(n-p)!$ permutations of the sequence $\{z_t(\phi)\}_{t=1}^{n-p}$, then

$$D(\phi) = \max_{j \in \{1, \ldots, (n-p)!\}} \sum_{t=1}^{n-p} \lambda\!\left(\frac{t}{n-p+1}\right) z_{j,t}(\phi).$$

In addition, $D$ is a nonnegative, continuous function on $\mathbb{R}^p$, and $D(\phi) = 0$
if and only if the elements of $\{z_t(\phi)\}_{t=1}^{n-p}$ are all equal.

Proof. See the proof of Theorem 1 in [14]. □

3. Asymptotic results. 

3.1. Parameter estimation. In order to establish asymptotic normality
for R-estimators of $\phi_0$, we make the following additional assumptions:

A2. $F$, the distribution function for the noise, is strictly increasing and
differentiable on $\mathbb{R}$ with density $f$;

A3. $f$ is uniformly continuous on $\mathbb{R}$ with $\sup_{s \in \mathbb{R}} |s| f(s) < \infty$;






B. ANDREWS, R. A. DAVIS AND F. J. BREIDT 



A4. the derivative of the weight function $\lambda$ exists and is uniformly continuous
on (0, 1).

Also, let $J = \int_0^1 \lambda^2(s)\,ds$, $K = \int_0^1 F^{-1}(s)\lambda(s)\,ds$ and
$L = \int_0^1 f(F^{-1}(s))\lambda'(s)\,ds$ and assume

A5. $\sigma^2 L > K$.

Theorem 3.1. If A1-A5 hold, then there exists a sequence of minimizers
$\hat\phi_R$ of $D(\cdot)$ in (2.4) such that

(3.1)  $$n^{1/2}(\hat\phi_R - \phi_0) \xrightarrow{d} Y \sim N(0, \Sigma),$$

where $\Sigma := \frac{\sigma^2 J - K^2}{2(\sigma^2 L - K)^2}\, \sigma^2 \Gamma_p^{-1}$, $\Gamma_p := [\gamma(j - k)]_{j,k=1}^{p}$ and $\gamma(\cdot)$
is the autocovariance function for the autoregressive process $\{(1/\phi_0(B)) Z_t\}$.

Proof. $D(\phi) - D(\phi_0) = S_n(\sqrt{n}(\phi - \phi_0))$, where $S_n(\cdot)$ is defined in
Lemma A.5 of the Appendix. Because $Y := -|\phi_{0r}| \sigma^2 \Gamma_p^{-1} N / [2(\sigma^2 L - K)]$
minimizes the limit $S(\cdot)$ in Lemma A.5, the result follows by Remark 1
in [10]. □

Remark 1. R-estimators of linear regression parameters are also consistent
and asymptotically normal [14]. Note, however, that the conditions
placed on $\lambda$ in assumption A4 are slightly stronger than those placed on the
weight function in [14], where the weight function is square integrable, not
necessarily bounded or continuous. The conditions in A4 can be relaxed to
some extent at the expense of stronger assumptions on $f$, but we do not pursue
those extensions here. Since piecewise continuous and unbounded weight
functions on (0, 1) can be well approximated by differentiable, bounded
weight functions, from a practical perspective, assumption A4 is not overly
restrictive.

Remark 2. Using the Cauchy-Schwarz inequality,

(3.2)  $$\sigma^2 J - K^2 = \sigma^2 E\{\lambda^2(F(Z_1))\} - (E\{Z_1 \lambda(F(Z_1))\})^2 \ge \sigma^2 E\{\lambda^2(F(Z_1))\} - E\{Z_1^2\} E\{\lambda^2(F(Z_1))\} = 0,$$

with equality in (3.2) if and only if $\lambda$ is proportional to $F^{-1}$, which is not
possible since $F^{-1}(0) = -\infty$, $F^{-1}(1) = \infty$ and $\lambda$ is bounded on (0, 1). Hence,
$\sigma^2 J - K^2 > 0$. $K = \int_0^1 F^{-1}(s)\lambda(s)\,ds$ is also greater than zero because $F^{-1}$
and $\lambda$ are strictly increasing functions on (0, 1) and $\lambda$ is odd about 1/2.
Without assumption A5, $\sigma^2 L - K$ is not necessarily greater than zero, however.
If the density function $f$ is differentiable, using integration by parts, it
can be shown that

$$L = E\{f(Z_1)\lambda'(F(Z_1))\} = -\int_{-\infty}^{\infty} f'(s)\lambda(F(s))\,ds = -\int_0^1 \frac{f'(F^{-1}(s))}{f(F^{-1}(s))}\lambda(s)\,ds.$$

Therefore, if $Z_1 \sim N(0, \sigma^2)$, then

$$\sigma^2 L = -\sigma^2 \int_0^1 \frac{f'(F^{-1}(s))}{f(F^{-1}(s))}\lambda(s)\,ds = \int_0^1 F^{-1}(s)\lambda(s)\,ds = K$$

and so A5 does not hold if $Z_1$ is Gaussian.

Remark 3. The asymptotic covariance matrix for $\hat\phi_R$ is a scalar multiple
of $n^{-1}\sigma^2\Gamma_p^{-1}$, the asymptotic covariance matrix for Gaussian likelihood
estimators of the parameters of the corresponding $p$th-order autoregressive
process. The same property holds for LAD and ML estimators of
all-pass model parameters, as shown in [6] and [2], respectively. The LAD
estimators are quasi-maximum likelihood estimators which can be obtained
by maximizing the log-likelihood of an all-pass model with Laplace noise
($f(s) = \exp(-\sqrt{2}|s|/\sigma)/(\sqrt{2}\sigma)$). The appropriate scalar multiple is

(3.3)  $$\frac{\mathrm{Var}|Z_1|}{2(2\sigma^2 f(0) - E|Z_1|)^2}$$

in the LAD case ([6] contains an error in the calculation of the asymptotic
variance; see [3] for the correction) and

(3.4)  $$\frac{1}{2\left(\sigma^2 \int (f'(s))^2 / f(s)\,ds - 1\right)}$$

in the ML case, while the multiple in (3.1) for R-estimation is

(3.5)  $$\frac{\sigma^2 J - K^2}{2(\sigma^2 L - K)^2}.$$

Consequently, the asymptotic relative efficiency (ARE) for R to LAD is
obtained by dividing (3.3) by (3.5) and the ARE for R to ML is obtained
by dividing (3.4) by (3.5).
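
As a rough numerical check of these ratios, the multiple for R-estimation can be approximated by simple quadrature. The sketch below is an illustration under stated assumptions rather than the authors' computation: it uses a midpoint rule on (0, 1), Wilcoxon weights and unit-variance Laplace noise, and the helper names (`r_multiple`, `laplace_quantile`, `laplace_dens_at_quantile`) are invented here. For this noise the LAD multiple works out to 1/2, since $\mathrm{Var}|Z_1| = 1/2$ and $2\sigma^2 f(0) - E|Z_1| = 1/\sqrt{2}$.

```python
import math


def r_multiple(lam, dlam, quantile, dens_at_quantile, sigma2=1.0, n=200_000):
    """Midpoint-rule approximation of (sigma^2 J - K^2) / (2 (sigma^2 L - K)^2)."""
    J = K = L = 0.0
    for i in range(n):
        s = (i + 0.5) / n                    # midpoint of the i-th subinterval of (0, 1)
        J += lam(s) ** 2                     # lambda^2(s)
        K += quantile(s) * lam(s)            # F^{-1}(s) * lambda(s)
        L += dens_at_quantile(s) * dlam(s)   # f(F^{-1}(s)) * lambda'(s)
    J, K, L = J / n, K / n, L / n
    return (sigma2 * J - K * K) / (2.0 * (sigma2 * L - K) ** 2)


# Laplace noise with variance one has scale b = 1 / sqrt(2).
b = 1.0 / math.sqrt(2.0)


def laplace_quantile(s):
    """F^{-1}(s) for the Laplace distribution with scale b."""
    return b * math.log(2.0 * s) if s < 0.5 else -b * math.log(2.0 * (1.0 - s))


def laplace_dens_at_quantile(s):
    """f(F^{-1}(s)) for the Laplace distribution with scale b."""
    return min(s, 1.0 - s) / b


wilcoxon_multiple = r_multiple(lambda s: s - 0.5, lambda s: 1.0,
                               laplace_quantile, laplace_dens_at_quantile)
lad_multiple = 0.5  # value of the LAD multiple for unit-variance Laplace noise
are_r_to_lad = lad_multiple / wilcoxon_multiple
```

Under these assumptions the Wilcoxon multiple is close to 5/6, so the ratio is close to 0.6, in line with the Laplace row of Table 1.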

Remark 4. Consider the sequence of weight functions $\{\lambda_m\}$ such that
$\lambda_m(s) = 2\pi^{-1}\arctan(m(s - 1/2))$. It is straightforward to show that $\lambda_m$
satisfies assumptions A1 and A4 for all $m > 0$. If $I\{\cdot\}$ denotes the indicator
function and $\tilde\mu := \mathrm{median}\{Z_1\}$, then A5 is satisfied for large $m$ when
$2\sigma^2 f(\tilde\mu) > E\{-Z_1 I\{Z_1 < \tilde\mu\} + Z_1 I\{Z_1 > \tilde\mu\}\}$; this holds for many
distributions, including the Laplace, logistic and Student's t (with degrees of
freedom greater than two) distributions, and various asymmetric distributions
[$0.4N(-1, 1) + 0.6N(2/3, 3^2)$ is one example]. Because $\lambda_m(s)$ converges pointwise
to $-I\{s < 1/2\} + I\{s > 1/2\}$ on (0, 1) as $m \to \infty$, $J_m = \int_0^1 \lambda_m^2(s)\,ds \to 1$
and, if $Z_1$ has median zero, $K_m = E\{Z_1 \lambda_m(F(Z_1))\} \to E|Z_1|$ and
$L_m = E\{f(Z_1)\lambda_m'(F(Z_1))\} \to 2f(0)$. Hence, if $Z_1$ has median zero,

$$\frac{\sigma^2 J_m - K_m^2}{2(\sigma^2 L_m - K_m)^2} \to \frac{\sigma^2 - E^2|Z_1|}{2(2\sigma^2 f(0) - E|Z_1|)^2} = \frac{\mathrm{Var}|Z_1|}{2(2\sigma^2 f(0) - E|Z_1|)^2}$$

and so R-estimation has virtually the same asymptotic efficiency as LAD
estimation when the weight function $\lambda_m$ is used with $m$ large. If $Z_1$ has a
Laplace distribution, LAD estimation corresponds to ML estimation. In the
case of Laplace noise, therefore, R-estimation with weight function $\lambda_m$ and
$m$ large also has essentially the same asymptotic efficiency as ML estimation.

Remark 5. Under general conditions, it can be shown that (3.5) equals
(3.4) when the weight function is proportional to $-f'(F^{-1}(s))/f(F^{-1}(s))$.
Thus, R-estimation has the same asymptotic efficiency as ML estimation
when an optimal weight function $\lambda_f(s) \propto -f'(F^{-1}(s))/f(F^{-1}(s))$ is used.
$\lambda_f$ is also an optimal weight function in the case of R-estimation for linear
regression parameters (see, e.g., [15]). Note that if $Z_1$ has a Laplace distribution,
then $\lambda_f(s) \propto -I\{s < 1/2\} + I\{s > 1/2\}$ for $s \in (0, 1/2) \cup (1/2, 1)$ ($\lambda_f$
does not exist at $s = 1/2$).
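
The reduction of (3.5) under such a weight function can be sketched as follows. This is an informal check, not the paper's argument: it assumes $f$ is differentiable, that $x f(x) \to 0$ as $|x| \to \infty$ so the boundary terms vanish, and it takes the proportionality constant to be one (the ratio in (3.5) is unchanged when $\lambda$ is rescaled). The symbol $I_f$ is notation introduced here for illustration.

```latex
\begin{align*}
I_f &:= \int_{-\infty}^{\infty} \frac{(f'(x))^2}{f(x)}\,dx, \qquad
\lambda_f(s) = -\frac{f'(F^{-1}(s))}{f(F^{-1}(s))}, \\
J &= \int_0^1 \lambda_f^2(s)\,ds
   = \int_{-\infty}^{\infty} \frac{(f'(x))^2}{f(x)}\,dx = I_f, \\
K &= \int_0^1 F^{-1}(s)\lambda_f(s)\,ds
   = -\int_{-\infty}^{\infty} x f'(x)\,dx
   = \int_{-\infty}^{\infty} f(x)\,dx = 1, \\
L &= -\int_0^1 \frac{f'(F^{-1}(s))}{f(F^{-1}(s))}\,\lambda_f(s)\,ds
   = \int_0^1 \lambda_f^2(s)\,ds = I_f, \\
\frac{\sigma^2 J - K^2}{2(\sigma^2 L - K)^2}
  &= \frac{\sigma^2 I_f - 1}{2(\sigma^2 I_f - 1)^2}
   = \frac{1}{2(\sigma^2 I_f - 1)}.
\end{align*}
```

The expression for $L$ uses the integration-by-parts identity from Remark 2. For Laplace noise, $I_f = 2/\sigma^2$ and the last expression equals 1/2, the same value as the LAD multiple, consistent with LAD corresponding to ML in that case.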

If $Z_1$ has a logistic distribution, then
$f(s) = \pi/(\sqrt{3}\sigma)\exp(-s\pi/(\sqrt{3}\sigma))/[1 + \exp(-s\pi/(\sqrt{3}\sigma))]^2$
and so an optimal weight function $\lambda_f$ is given by the
Wilcoxon weight function $\lambda(s) = s - 1/2$. For the Wilcoxon weights, assumption
A5 is satisfied when $\sigma^2 E\{f(Z_1)\} > E\{Z_1 F(Z_1)\}$, which holds for
the Laplace, logistic, Student's t and $0.4N(-1, 1) + 0.6N(2/3, 3^2)$ distributions,
as well as many others. Columns 2 and 3 of Table 1 give values of ARE
for R (with Wilcoxon weights) to LAD and R (with Wilcoxon weights) to ML
for a number of distributions. For the logistic and Student's t-distributions,
R-estimation is asymptotically much more efficient than LAD and essentially
as efficient as ML. Also, even though R-estimation (with Wilcoxon weights)
is asymptotically only 60% as efficient as ML estimation when the
noise distribution is Laplace (Table 1), R-estimation can still be useful in this case
because $D(\cdot)$ tends to be smoother than $\sum_{t=1}^{n-p}|z_t(\cdot)|$ and hence easier to
minimize. Figure 1 shows ML and R objective functions for a realization of
length $n = 50$ from an all-pass model with $p = 1$, $\phi_{01} = 0.5$ and Laplace noise
with variance one. Observe that the ML objective function has many local
minima and thus could be difficult to minimize using numerical optimization
techniques.



Remark 6. Another weight function commonly used for R-estimation
is the van der Waerden weight function $\lambda(s) = \Phi^{-1}(s)$, where $\Phi$ is the standard
normal distribution function. Using results in [12], it can be shown
that, if $f$ is absolutely continuous and almost everywhere differentiable with
$0 < \int (f'(s))^2/f(s)\,ds < \infty$, then A5 holds for the van der Waerden weights
if and only if $Z_1$ is non-Gaussian. So, although A4 does not hold because $\Phi^{-1}$
is unbounded on (0, 1), a bounded weight function approximating $\Phi^{-1}$ which
does satisfy the assumptions can be found for a large class of non-Gaussian
noise distributions. However, since $\Phi^{-1}$ is optimal when $Z_1 \sim N(0, \sigma^2)$ and
the parameters of a Gaussian all-pass series are not identifiable, the van der
Waerden weights are not particularly useful for all-pass parameter estimation.
Column 4 of Table 1 gives the AREs for R (with Wilcoxon weights) to
R (with van der Waerden weights) for various noise distributions. The van
der Waerden weights are asymptotically superior to the Wilcoxon weights
only when the distribution is close to Gaussian.

Fig. 1. ML and R (with Wilcoxon weights) objective functions for a realization of length
$n = 50$ from an all-pass model with $p = 1$, $\phi_{01} = 0.5$ and Laplace noise with variance one.

3.2. Order selection. In practice, the true order $r$ of an all-pass model is
usually unknown and must be estimated. In this section, we give an order
selection procedure that is analogous to using the partial autocorrelation
function to identify the order of an autoregressive model. First, note that

$$\frac{\sigma^2 J - K^2}{2(\sigma^2 L - K)^2} = \frac{J - (|\phi_{0r}| K_z/\sigma)^2}{2\left(\sigma L_z/|\phi_{0r}| - |\phi_{0r}| K_z/\sigma\right)^2},$$

where $K_z := \int_0^1 F_z^{-1}(s)\lambda(s)\,ds$, $L_z := \int_0^1 f_z(F_z^{-1}(s))\lambda'(s)\,ds$ and $f_z$ and $F_z$
are the density and distribution functions, respectively, for $z_1 = \phi_{0r}^{-1} Z_1$. Because
$\hat\phi_R \xrightarrow{P} \phi_0$,

(3.6)  $$s := \left(\frac{1}{n}\sum_{t=1}^{n-p} z_t^2(\hat\phi_R)\right)^{1/2} \xrightarrow{P} (E\{z_1^2\})^{1/2}$$

and $\hat K_z := n^{-1} D(\hat\phi_R) \xrightarrow{P} K_z$ by Lemma A.6 in the Appendix. Corollary 3.1
provides a consistent estimator of $L_z$.

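Corollary 3.1. Consider the kernel density estimator of $f_z$

(3.7)  $$f_n(s) := \frac{1}{b_n n}\sum_{t=1}^{n-p} \kappa\!\left(\frac{s - z_t(\hat\phi_R)}{b_n}\right),$$

where $\kappa$ is a uniformly continuous, differentiable kernel density function
on $\mathbb{R}$ such that $\int |s \ln|s||^{1/2}\, |\kappa'(s)|\,ds < \infty$ and $\kappa'$ is uniformly continuous
on $\mathbb{R}$, and where the bandwidth sequence $\{b_n\}$ is chosen so that $b_n \to 0$
and $b_n^2\sqrt{n} \to \infty$ as $n \to \infty$. If A1-A5 hold, then
$\hat L_z := n^{-1}\sum_{t=1}^{n-p} \lambda'(t/(n-p))\, f_n(z_{(t)}(\hat\phi_R)) \xrightarrow{P} L_z$,
where $z_{(1)}(\hat\phi_R) \le \cdots \le z_{(n-p)}(\hat\phi_R)$ are the ordered residuals.

Proof. If $F_n(s) := (n-p)^{-1}\sum_{t=1}^{n-p} I\{z_t(\hat\phi_R) \le s\}$,
$F_n^{-1}(s) := \inf\{x : F_n(x) \ge s\}$ and

$$\lambda_n'(s) := \lambda'\!\left(\frac{t}{n-p}\right) \quad \text{for } s \in \left(\frac{t-1}{n-p}, \frac{t}{n-p}\right], \quad t = 1, \ldots, n-p,$$
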
Table 1

AREs for R (with Wilcoxon weights) to LAD, R (with Wilcoxon weights) to ML and R
(with Wilcoxon weights) to R (with van der Waerden weights) for the Laplace
distribution, the logistic distribution and Student's t-distribution with several different
degrees of freedom

Noise distribution    ARE (R to LAD)    ARE (R to ML)    ARE (R to R)
Laplace               0.600             0.600            1.026
logistic              1.976             1.000            1.049
t(3)                  1.411             0.962            1.208
t(6)                  2.068             0.997            1.083
t(9)                  2.354             0.980            1.023
t(12)                 2.510             0.964            0.990
t(15)                 2.607             0.952            0.971
t(20)                 2.707             0.937            0.953
t(30)                 2.810             0.921            0.938

then $n\hat L_z/(n-p) = \int_0^1 f_n(F_n^{-1}(s))\lambda_n'(s)\,ds$. By the uniform continuity of $\lambda'$,
$\sup_{s \in (0,1)} |\lambda_n'(s) - \lambda'(s)| \to 0$. Consequently, since $\sup_{s \in (0,1)} f_z(F_z^{-1}(s)) < \infty$
and $\sup_{s \in (0,1)} |\lambda'(s)| < \infty$, the proof is complete if

(3.8)  $$\sup_{s \in (0,1)} |f_n(F_n^{-1}(s)) - f_z(F_z^{-1}(s))| \le \sup_{s \in (0,1)} |f_n(F_n^{-1}(s)) - f_z(F_n^{-1}(s))| + \sup_{s \in (0,1)} |f_z(F_n^{-1}(s)) - f_z(F_z^{-1}(s))|$$

is $o_p(1)$. Because $\sup_{s \in \mathbb{R}} |f_n(s) - f_z(s)| \xrightarrow{P} 0$ (for proof of this result, see
Lemma 16 on page 88 of [3]; a similar result is given in Theorem 3 of [21]),
the first term in (3.8) is $o_p(1)$. We now consider the second term and use an
argument similar to one found in the proof of Lemma 4 in [18]. Note that
$\sup_{s \in (0,1)} |F_z(F_n^{-1}(s)) - s| = \sup_{s \in \mathbb{R}} |F_n(s) - F_z(s)|$ and, using the
Glivenko-Cantelli theorem, it can be shown that $\sup_{s \in \mathbb{R}} |F_n(s) - F_z(s)| \xrightarrow{P} 0$. Therefore,
because $f_z(F_z^{-1}(\cdot))$ is uniformly continuous on (0, 1) and $F_z^{-1}(F_z(s)) = s$ for
all $s \in \mathbb{R}$ (since $F_z$ is strictly increasing on $\mathbb{R}$), we have

$$\sup_{s \in (0,1)} |f_z(F_n^{-1}(s)) - f_z(F_z^{-1}(s))| = \sup_{s \in (0,1)} |f_z(F_z^{-1}[F_z\{F_n^{-1}(s)\}]) - f_z(F_z^{-1}(s))| \xrightarrow{P} 0. \quad □$$

It follows that

(3.9)  $$\frac{J - (s^{-1}\hat K_z)^2}{2(s\hat L_z - s^{-1}\hat K_z)^2} \xrightarrow{P} \frac{\sigma^2 J - K^2}{2(\sigma^2 L - K)^2}.$$

Note that the Gaussian and the Student t-densities satisfy the conditions
for the kernel density function $\kappa$ in Corollary 3.1.

We now give the following corollary for use in order selection.

Corollary 3.2. Assume A1-A5 hold. If the true order of the all-pass
model is $r$ and the order of the fitted model is $p > r$, then
$n^{1/2}\hat\phi_{p,R} \xrightarrow{d} N(0, (\sigma^2 J - K^2)/[2(\sigma^2 L - K)^2])$.

Proof. By Problem 8.15 in [7], the $p$th diagonal element of $\Gamma_p^{-1}$ is
$\sigma^{-2}$ if $p > r$, so the result follows from (3.1). □

A practical approach to order determination using a large sample is described
as follows:
Table 2

Empirical means, standard deviations and percent coverages of
nominal 95% confidence intervals for R-estimates of all-pass model parameters.
The LAD-like score function $\lambda(s) = 2\pi^{-1}\arctan(500(s - 1/2))$ and the Wilcoxon
score function $\lambda(s) = s - 1/2$ were used. The noise distribution is Laplace with variance
one

        Asymptotic                          Empirical
n       mean             std. dev.          mean              std. dev.         % coverage
                         (LAD/Wilcoxon)     (LAD/Wilcoxon)    (LAD/Wilcoxon)    (LAD/Wilcoxon)
500     $\phi_1$ = 0.5   0.0275/0.0354      0.499/0.497       0.0332/0.0593     97.7/96.2
5000    $\phi_1$ = 0.5   0.0087/0.0112      0.500/0.499       0.0093/0.0112     97.9/96.0
500     $\phi_1$ = 0.3   0.0291/0.0374      0.299/0.299       0.0413/0.0444     96.5/94.9
        $\phi_2$ = 0.4   0.0291/0.0374      0.397/0.392       0.0479/0.0599     97.6/95.4
5000    $\phi_1$ = 0.3   0.0092/0.0118      0.300/0.300       0.0101/0.0122     97.6/95.2
        $\phi_2$ = 0.4   0.0092/0.0118      0.399/0.399       0.0099/0.0119     97.5/96.7

1. For some large $P$, fit all-pass models of order $p$, $p = 1, 2, \ldots, P$, via
R-estimation and obtain the $p$th coefficient, $\hat\phi_{p,R}$, for each.

2. Let the model order $r$ be the smallest order beyond which the estimated
coefficients are statistically insignificant; that is,
$r = \min\{0 \le p \le P : |\hat\phi_{j,R}| < 1.96\,\hat\tau\, n^{-1/2} \text{ for } j > p\}$, where
$\hat\tau := \{[J - (s^{-1}\hat K_z)^2]/[2(s\hat L_z - s^{-1}\hat K_z)^2]\}^{1/2}$
and the estimates $s$, $\hat K_z$ and $\hat L_z$ are from the fitted $P$th-order model.
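
The selection rule in step 2 can be sketched as a small function. This is a minimal illustration, assuming the last coefficients $\hat\phi_{1,R}, \ldots, \hat\phi_{P,R}$ from the $P$ fitted models and the estimate $\hat\tau$ have already been computed; the function name `select_order` and its arguments are hypothetical.

```python
import math


def select_order(last_coefs, tau_hat, n):
    """Smallest p in {0, ..., P} with |coef_j| < 1.96 * tau_hat / sqrt(n) for all j > p.

    last_coefs[j - 1] is the j-th (last) coefficient of the fitted order-j model.
    """
    bound = 1.96 * tau_hat / math.sqrt(n)
    P = len(last_coefs)
    for p in range(P + 1):
        # For p = P the condition is vacuous, so the loop always returns.
        if all(abs(last_coefs[j - 1]) < bound for j in range(p + 1, P + 1)):
            return p
    return P
```

For instance, with `last_coefs = [1.2, -0.55, 0.03, -0.02, 0.01]`, `tau_hat = 1.0` and `n = 1000`, the bound is about 0.062 and the selected order is 2.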

4. Numerical results. 

4.1. Simulation study. In this section we give the results of a simulation
study to assess the quality of the asymptotic approximations for finite
samples. First, for each of 1000 replicates, we simulated all-pass data
and found $\hat\phi_R$ by minimizing $D$ in (2.4). To reduce the possibility of the
optimizer getting trapped at local minima, we chose 1000 random starting
values for each replicate. We evaluated $D$ at each of the 1000 candidate
values and then reduced the collection of initial values to the twelve
with the smallest values of $D$. Optimized values were found using these
twelve initial values as starting points. The optimized value for which $D$
was smallest was chosen to be $\hat\phi_R$. Confidence intervals for the elements of
$\phi_0$ were constructed using (3.1) and the estimator in (3.9). For the kernel
density estimator (3.7), we used the standard Gaussian kernel density function
and, because of its recommendation in [22], page 48, we used bandwidth
$b_n = 0.9 n^{-1/5}\min\{s, \mathrm{IQR}/1.34\}$, where $s$, defined in (3.6), is the sample standard
deviation for $\{z_t(\hat\phi_R)\}$ and IQR is the interquartile range for $\{z_t(\hat\phi_R)\}$.

Results of these simulations appear in Tables 2 and 3. We show the empirical
means, standard deviations, and percent coverages of nominal 95%
confidence intervals for the R-estimates of all-pass model parameters. The
LAD-like score function $\lambda(s) = 2\pi^{-1}\arctan(500(s - 1/2))$ and the Wilcoxon
score function $\lambda(s) = s - 1/2$ were used. Asymptotic means and standard
deviations were obtained using Theorem 3.1. Note that the R-estimates appear
nearly unbiased and the confidence interval coverages are close to the
nominal 95% level. The asymptotic standard deviations tend to understate
the true variability of the estimates when $n = 500$, but are fairly accurate
when $n = 5000$. Normal probability plots show that the R-estimates are
approximately normal, particularly when $n = 5000$. The quality of the asymptotic
approximations for finite samples is similar for LAD and ML estimates
(see [2] and [6]).
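
To illustrate how asymptotic standard deviations follow from Theorem 3.1, the sketch below computes them for Wilcoxon weights and unit-variance Laplace noise. Several inputs are assumptions stated here rather than taken from the text: the scalar multiple 5/6 is the value obtained by evaluating $J = 1/12$, $K = 3/(8\sqrt{2})$ and $L = \sqrt{2}/4$ for this noise, and the diagonal entries of $\sigma^2\Gamma_p^{-1}$ use the standard AR(1) and AR(2) expressions $1 - \phi_1^2$ and $1 - \phi_2^2$. The helper name `asymptotic_sd` is hypothetical.

```python
import math


def asymptotic_sd(multiple, sigma2_gamma_inv_diag, n):
    """sqrt(multiple * [sigma^2 Gamma_p^{-1}]_{jj} / n), following Theorem 3.1."""
    return math.sqrt(multiple * sigma2_gamma_inv_diag / n)


# Assumed scalar multiple for Wilcoxon weights and unit-variance Laplace noise.
wilcoxon_laplace = 5.0 / 6.0

# p = 1, phi_01 = 0.5: gamma(0) = sigma^2 / (1 - phi^2), so sigma^2 / gamma(0) = 1 - phi^2.
sd_p1 = {n: asymptotic_sd(wilcoxon_laplace, 1 - 0.5 ** 2, n) for n in (500, 5000)}

# p = 2, phi_0 = (0.3, 0.4)': both diagonal entries of sigma^2 Gamma_2^{-1} equal 1 - phi_2^2.
sd_p2 = {n: asymptotic_sd(wilcoxon_laplace, 1 - 0.4 ** 2, n) for n in (500, 5000)}
```

Rounded to four decimals, these values agree with the Wilcoxon entries in the asymptotic standard deviation column of Table 2 (0.0354, 0.0112, 0.0374 and 0.0118).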

We also ran simulations to assess the order selection procedure described 
in Section 3.2. For each of 100 replicates, we simulated all-pass data and 

Table 3

Empirical means, standard deviations and percent coverages of nominal 95%
confidence intervals for R-estimates of all-pass model parameters. The LAD-like
score function $\lambda(s) = 2\pi^{-1}\arctan(500(s - 1/2))$ and the Wilcoxon score function
$\lambda(s) = s - 1/2$ were used. The noise distribution is Student's t with three degrees of
freedom

        Asymptotic                          Empirical
n       mean             std. dev.          mean              std. dev.         % coverage
                         (LAD/Wilcoxon)     (LAD/Wilcoxon)    (LAD/Wilcoxon)    (LAD/Wilcoxon)
500     $\phi_1$ = 0.5   0.0327/0.0279      0.499/0.498       0.0405/0.0331     95.8/96.2
5000    $\phi_1$ = 0.5   0.0103/0.0088      0.500/0.500       0.0110/0.0090     95.2/95.6
500     $\phi_1$ = 0.3   0.0346/0.0296      0.301/0.299       0.0403/0.0366     95.1/94.7
        $\phi_2$ = 0.4   0.0346/0.0296      0.396/0.396       0.0418/0.0366     95.2/94.9
5000    $\phi_1$ = 0.3   0.0109/0.0093      0.300/0.300       0.0118/0.0095     94.0/95.4
        $\phi_2$ = 0.4   0.0109/0.0093      0.400/0.400       0.0115/0.0097     94.6/95.1

Table 4

The frequencies for each estimate of model order r when P = 5 and the Wilcoxon scores
were used

                                        Laplace noise           t noise
n       Model parameters                1   2   3   4   5       1   2   3   4   5
500     $\phi_0$ = 0.5 (r = 1)          58  7   10  8   17      52  8   9   13  18
5000    $\phi_0$ = 0.5 (r = 1)          67  2   3   7   21      56  3, 7, 34 (one of the four remaining entries is missing)
500     $\phi_0$ = (0.3, 0.4)' (r = 2)  0   69  1   16  14      0   57  13  16  14
5000    $\phi_0$ = (0.3, 0.4)' (r = 2)  0   82  5   5   8       0   81  5   6   8

estimated the model order r using the procedure in Section 3.2 with P = 5
and Wilcoxon scores. Table 4 gives the frequencies for each estimate of r. In 
all cases, the procedure appears to be fairly successful at identifying the true 
value of r. Model orders less than r were never selected, so underestimating 
r is clearly not a concern. 

4.2. Deconvolution. Applications for all-pass models are not limited to 
uncorrelated time series. As discussed in Section 1 and in [2], all-pass models 
can also be used to identify and model noncausal and noninvertible ARMA 
series. If, for example, a causal, invertible ARMA model is fit to a causal, 
noninvertible series, the residuals follow a causal all-pass model of order r, 
where r is the number of roots of the true moving average polynomial inside 
the unit circle. Therefore, the order of noninvertibility of the ARMA, r, can 
be determined by identifying the all-pass order of the residuals. 

Consider the simulated water gun seismogram $\{X_t\}_{t=1}^{1000}$ shown in Figure 2(a),
where $X_t = \sum_k \beta_k Z_{t-k}$, $\{\beta_k\}$ is the water gun wavelet sequence in
Figure 8(2) of [19] and $\{Z_t\}$ is a reflectivity sequence which was simulated
as i.i.d. noise from the Student t-distribution with five degrees of freedom.
Andrews, Davis and Breidt [2] modeled $\{X_t\}$ as a possibly noninvertible
ARMA, using ML estimation for all-pass models to identify an appropriate
order of noninvertibility. The wavelet and reflectivity sequences were then
reconstructed from $\{X_t\}$ using the fitted ARMA model. This deconvolution
procedure is of interest because, for an observed water gun seismogram, the
reflectivity sequence is unknown and corresponds to reflection coefficients
for layers of the Earth. In this section, we identify an appropriate order
of noninvertibility for $\{X_t\}$ using R-estimation for all-pass models and we
compare the R-estimation results to ML results in [2].

Andrews, Davis and Breidt [2] first fit a causal, invertible ARMA(12, 13)
model $\phi(B)X_t = \theta(B)W_t$ to the simulated seismogram $\{X_t\}$ using Gaussian
ML. The residuals from this fitted ARMA model are denoted $\{\hat W_t\}$. From the
sample autocorrelation functions for $\{\hat W_t\}$, $\{\hat W_t^2\}$ and $\{|\hat W_t|\}$ in Figure 2(b)-(d),
it appears that these ARMA residuals are uncorrelated but dependent,
suggesting that a causal, invertible model is inappropriate for $\{X_t\}$. Using
ML estimation and the Student t-density, a causal all-pass model of order
two was determined to be most suitable for $\{\hat W_t\}$ [2]. The ML estimates of
the all-pass model parameters are $\hat\phi_{ML} = (1.5286, -0.5908)'$, both with standard
error 0.0338. Since the all-pass residuals appear independent, Andrews,
Davis and Breidt [2] concluded that a causal, noninvertible ARMA(12, 13)
with two roots of the moving average polynomial inside the unit circle is an
appropriate model for $\{X_t\}$.

Fig. 2. (a) The simulated seismogram of length 1000, $\{X_t\}$, and the sample autocorrelation
functions with bounds $\pm 1.96/\sqrt{1000}$ for (b) $\{\hat W_t\}$, (c) $\{\hat W_t^2\}$ and (d) $\{|\hat W_t|\}$.

When the Wilcoxon weight function, the standard Gaussian kernel density
function and bandwidth $b_n = 0.9n^{-1/5}\min\{s, \mathrm{IQR}/1.34\}$ are used, the order
selection procedure described in Section 3.2 also indicates that an all-pass
model of order two is appropriate for $\{\hat W_t\}$. The R-estimates of the all-pass
model parameters are $\hat{\boldsymbol{\phi}}_R = (1.5052, -0.5700)'$, both with standard error
0.0343. In Figure 3, we show the sample autocorrelation functions for the
squares and absolute values of $\{Z_t\}$, the residuals from the all-pass model
fit to $\{\hat W_t\}$ using R-estimation; these all-pass residuals appear independent.
Therefore, in this example, the ML and R all-pass estimation results are
nearly identical, even though no specific distributional information was used
for R-estimation.
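The bandwidth rule quoted above can be computed directly from a sample. The sketch below reads the exponent as $-1/5$ (Silverman's rule of thumb) and takes $s$ to be the sample standard deviation and IQR the interquartile range; these readings, and the helper names `rule_of_thumb_bandwidth` and `gaussian_kde_at`, are assumptions made for illustration. The kernel density estimate is included only to show where the bandwidth enters; how the estimate is used in Section 3.2 is not reproduced here.

```python
import numpy as np

def rule_of_thumb_bandwidth(x):
    """b_n = 0.9 * n^(-1/5) * min(s, IQR / 1.34)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = x.std(ddof=1)                       # sample standard deviation
    q75, q25 = np.percentile(x, [75, 25])   # interquartile range endpoints
    return 0.9 * n ** (-0.2) * min(s, (q75 - q25) / 1.34)

def gaussian_kde_at(points, x, bandwidth):
    """Gaussian kernel density estimate built from data x, evaluated at points."""
    points = np.atleast_1d(np.asarray(points, dtype=float))
    x = np.asarray(x, dtype=float)
    u = (points[:, None] - x[None, :]) / bandwidth
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (x.size * bandwidth * np.sqrt(2 * np.pi))

# Hypothetical usage on simulated residuals.
rng = np.random.default_rng(0)
resid = rng.standard_normal(1000)
b_n = rule_of_thumb_bandwidth(resid)
density_at_zero = gaussian_kde_at(0.0, resid, b_n)[0]
```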






[Figure 3: panel (a) ACF of $Z_t^2$; panel (b) ACF of $|Z_t|$; horizontal axis: Lag, with ticks at 10, 20, 30 and 40.]

Fig. 3. Diagnostics for the all-pass model of order two fit to the causal, invertible
ARMA residuals using R-estimation. The sample autocorrelation functions with bounds
$\pm 1.96/\sqrt{1000}$ for (a) $\{Z_t^2\}$ and (b) $\{|Z_t|\}$ are shown.



APPENDIX 

This section contains proofs of the lemmas used to establish the results
of Section 3. We assume that assumptions A1-A5 hold throughout. First,
note that for $j \in \{1,\ldots,p\}$ and $t \in \{1,\ldots,n-p\}$,

(see [2] for details). Evaluating (A.1) at the true value of $\boldsymbol{\phi}$ and ignoring the
effect of recursion initialization, we have



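$$
\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\phi_j} \simeq -\frac{z_{t-j}}{\phi_0(B)} + \frac{z_{t+j}}{\phi_0(B^{-1})}, \tag{A.2}
$$

where the first term is an element of $\sigma(z_{t-1}, z_{t-2}, \ldots)$ and the second term
is an element of $\sigma(z_{t+1}, z_{t+2}, \ldots)$ because $\phi_0(B)$ is a causal operator and
$\phi_0(B^{-1})$ is a purely noncausal operator. It follows that (A.2) is independent
of $z_t = \phi_{0r}^{-1}Z_t$. Thus, if $F_z$ is the distribution function of $z_1$ and $g_t(\boldsymbol{\phi}) :=
\lambda(F_z(z_t))z_t(\boldsymbol{\phi})$, then for $j \in \{1,\ldots,p\}$,

$$
\frac{\partial g_t(\boldsymbol{\phi}_0)}{\partial\phi_j} \simeq \lambda(F_z(z_t))\left\{-\frac{z_{t-j}}{\phi_0(B)} + \frac{z_{t+j}}{\phi_0(B^{-1})}\right\} =: \frac{\partial g_t^*(\boldsymbol{\phi}_0)}{\partial\phi_j}.
$$

The expected value of $\partial g_t^*(\boldsymbol{\phi}_0)/\partial\phi_j$ is zero by the independence of its two
terms.
We now compute the autocovariance function $\gamma^\dagger(h)$ of the zero-mean,
stationary process $\{\mathbf{u}'\partial g_t^*(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi}\}$ for $\mathbf{u} \in \mathbb{R}^p$:

$$
\gamma^\dagger(h) = E\left\{\mathbf{u}'\frac{\partial g_t^*(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\frac{\partial g_{t+h}^*(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}'}\mathbf{u}\right\}
= \begin{cases}
\mathbf{u}'\left[2\phi_{0r}^{-2}\tilde J\gamma(j-k)\right]_{j,k=1}^{p}\mathbf{u}, & \text{if } h = 0,\\[4pt]
-\phi_{0r}^{-2}\tilde K^2\,\mathbf{u}'\left[\psi_{|h|-j}\psi_{|h|-k}\right]_{j,k=1}^{p}\mathbf{u}, & \text{if } h \neq 0,
\end{cases}
$$

where the $\psi_\ell$ are given by $\sum_{\ell=0}^{\infty}\psi_\ell z^\ell = 1/\phi_0(z)$ with $\psi_\ell = 0$ for $\ell < 0$. Thus,

$$
\gamma^\dagger(0) + 2\sum_{h=1}^{\infty}\gamma^\dagger(h)
= \mathbf{u}'\left\{\left[2\phi_{0r}^{-2}\tilde J\gamma(j-k)\right]_{j,k=1}^{p} - 2\phi_{0r}^{-2}\tilde K^2\sum_{h=1}^{\infty}\left[\psi_{h-j}\psi_{h-k}\right]_{j,k=1}^{p}\right\}\mathbf{u}
= 2\phi_{0r}^{-2}\{\sigma^2\tilde J - \tilde K^2\}\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}.
$$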



The preceding calculations lead directly to the following lemma. 

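Lemma A.1. As $n \to \infty$, $n^{-1/2}\sum_{t=1}^{n-p}\partial g_t(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi} \xrightarrow{d} \mathbf{N} \sim N(\mathbf{0}, 2\phi_{0r}^{-2}\{\sigma^2\tilde J - \tilde K^2\}\sigma^{-2}\Gamma_p)$.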

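Proof. Note that, for $t \in \{0,\ldots,n-p-1\}$,

$$
z_{n-p-t} = \sum_{\ell=0}^{\infty}\psi_\ell\left(\phi_0(B^{-1})z_{n-p-t+\ell}\right) \quad\text{and}
$$

$$
z_{n-p-t}(\boldsymbol{\phi}_0) = \sum_{\ell=0}^{t}\psi_\ell\left(\phi_0(B^{-1})z_{n-p-t+\ell}\right). \tag{A.3}
$$

Because there exist constants $c > 0$ and $0 < d < 1$ such that $|\psi_\ell| < cd^\ell$ for all
$\ell \in \{0, 1, \ldots\}$ (see [7], Section 3.3), we have

$$
E\sum_{t=1}^{n-p}\left|\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\phi_j} - \left\{-\frac{z_{t-j}}{\phi_0(B)} + \frac{z_{t+j}}{\phi_0(B^{-1})}\right\}\right| = O(1)
$$

for $j \in \{1,\ldots,p\}$. Consequently, $n^{-1/2}\sum_{t=1}^{n-p}[\partial g_t(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi} - \partial g_t^*(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi}] \to \mathbf{0}$
in $L_1$ and hence in probability.

Let $\mathbf{u} \in \mathbb{R}^p$. By the Cramér-Wold device, it suffices to show that
$n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\partial g_t^*(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi} \xrightarrow{d} \mathbf{u}'\mathbf{N} \sim N(0, 2\phi_{0r}^{-2}\{\sigma^2\tilde J - \tilde K^2\}\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u})$. Elements of
the infinite-order moving average stationary sequence $\{\mathbf{u}'\partial g_t^*(\boldsymbol{\phi}_0)/\partial\boldsymbol{\phi}\}$ can
be truncated to create a finite-order moving average stationary sequence.
By applying a central limit theorem ([7], Theorem 6.4.2) to each truncation
level, asymptotic normality can be deduced. The details are omitted. □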
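The weights $\psi_\ell$ in $\sum_{\ell}\psi_\ell z^\ell = 1/\phi_0(z)$ and the geometric bound $|\psi_\ell| < cd^\ell$ used in the proof above can be checked numerically. The following is a minimal sketch, assuming the convention $\phi(z) = 1 - \phi_1 z - \cdots - \phi_p z^p$ (taken here as an assumption about the notation of the earlier sections) and using the R-estimates $(1.5052, -0.5700)$ from the seismogram example purely as illustrative input. The helper name `psi_weights` is hypothetical.

```python
import numpy as np

def psi_weights(phi, n_terms):
    """Coefficients psi_0..psi_{n_terms-1} of 1/phi(z), where
    phi(z) = 1 - phi_1 z - ... - phi_p z^p, via psi_l = sum_j phi_j psi_{l-j}."""
    phi = np.asarray(phi, dtype=float)
    psi = np.zeros(n_terms)
    psi[0] = 1.0
    for l in range(1, n_terms):
        for j in range(1, min(l, phi.size) + 1):
            psi[l] += phi[j - 1] * psi[l - j]
    return psi

phi_hat = np.array([1.5052, -0.5700])   # illustrative input only
psi = psi_weights(phi_hat, 200)

# Roots of phi(z): np.roots expects coefficients from highest degree down.
ascending = np.concatenate(([1.0], -phi_hat))
min_root_modulus = np.min(np.abs(np.roots(ascending[::-1])))

# Causality (all roots outside the unit circle) gives geometric decay with
# rate d = 1 / min_root_modulus; decay_const plays the role of c.
d = 1.0 / min_root_modulus
decay_const = np.max(np.abs(psi[:100]) / d ** np.arange(100))
```

Because the two roots here are a complex-conjugate pair close to the real axis, the constant $c$ is noticeably larger than one even though the decay rate $d$ is well below one.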

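Now, consider the mixed partials of $g_t(\boldsymbol{\phi})$. For $j, k \in \{1,\ldots,p\}$,

$$
\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\phi_j\,\partial\phi_k}
\simeq \frac{1}{\phi_0(B)\phi_0(B^{-1})}\{-z_{t+j-k} - z_{t+k-j}\} + \frac{2z_{t+j+k}}{\phi_0^2(B^{-1})}
= -\sum_{m=0}^{\infty}\sum_{\ell=0}^{\infty}\psi_m\psi_\ell\,(z_{t+j-k-\ell+m} + z_{t+k-j-\ell+m}) + \frac{2z_{t+j+k}}{\phi_0^2(B^{-1})},
$$

and so

$$
\frac{\partial^2 g_t(\boldsymbol{\phi}_0)}{\partial\phi_j\,\partial\phi_k}
= \lambda(F_z(z_t))\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\phi_j\,\partial\phi_k}
\simeq \lambda(F_z(z_t))\left\{-\sum_{m=0}^{\infty}\sum_{\ell=0}^{\infty}\psi_m\psi_\ell\,(z_{t+j-k-\ell+m} + z_{t+k-j-\ell+m}) + \frac{2z_{t+j+k}}{\phi_0^2(B^{-1})}\right\}
=: \frac{\partial^2 g_t^*(\boldsymbol{\phi}_0)}{\partial\phi_j\,\partial\phi_k}. \tag{A.4}
$$

(A.4) has expectation $-2\sigma^{-2}\gamma(j-k)\int_0^1 F_z^{-1}(s)\lambda(s)\,ds = -2|\phi_{0r}|^{-1}\tilde K\sigma^{-2}\gamma(j-k)$.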



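Lemma A.2. As $n \to \infty$, $n^{-1}\sum_{t=1}^{n-p}\partial^2 g_t(\boldsymbol{\phi}_0)/(\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}') \xrightarrow{P} -2|\phi_{0r}|^{-1}\tilde K\sigma^{-2}\Gamma_p$.

Proof. It can be shown that $n^{-1}\sum_{t=1}^{n-p}[\partial^2 g_t(\boldsymbol{\phi}_0)/(\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}') -
\partial^2 g_t^*(\boldsymbol{\phi}_0)/(\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}')] \to \mathbf{0}$ in $L_1$ and in probability. Because (A.4) has
expectation $-2|\phi_{0r}|^{-1}\tilde K\sigma^{-2}\gamma(j-k)$, $n^{-1}\sum_{t=1}^{n-p}\partial^2 g_t^*(\boldsymbol{\phi}_0)/(\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}') \xrightarrow{a.s.} -2|\phi_{0r}|^{-1}\tilde K\sigma^{-2}\Gamma_p$
by the ergodic theorem. □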



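Lemma A.3. For any $T \in (0,\infty)$, as $n \to \infty$,

$$
\sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\left[\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})}{n-p+1}\right) - \lambda(F_z(z_t))\right]\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}} - 2|\phi_{0r}|^{-1}\tilde L\,\mathbf{u}'\Gamma_p\mathbf{u}\right| \xrightarrow{P} 0. \tag{A.5}
$$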

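Proof. Observe that the left-hand side of (A.5) is bounded above by

$$
\sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})}{n-p+1} - F_z(z_t)\right] - 2|\phi_{0r}|^{-1}\tilde L\,\mathbf{u}'\Gamma_p\mathbf{u}\right| \tag{A.6}
$$

$$
{}+ \sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\left[\lambda'(F^*_{t,n}(\mathbf{u})) - \lambda'(F_z(z_t))\right]\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})}{n-p+1} - F_z(z_t)\right]\right|, \tag{A.7}
$$

where $F^*_{t,n}(\mathbf{u})$ is between $F_z(z_t)$ and $R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})/(n-p+1)$. If $F_n(x) :=
n^{-1}\sum_{t=1}^{n-p}I\{z_t \le x\}$, an upper bound for (A.6) is

$$
\sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})}{n-p+1} - F_n\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right)\right]\right| \tag{A.8}
$$

$$
{}+ \sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[F_n\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right) - F_z\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right)\right]\right| \tag{A.9}
$$

$$
{}+ \sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[F_z\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right) - F_z(z_t)\right] - 2|\phi_{0r}|^{-1}\tilde L\,\mathbf{u}'\Gamma_p\mathbf{u}\right|. \tag{A.10}
$$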



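Because

$$
\sup_{\|\mathbf{u}\|\le T}\frac{1}{n}\sum_{t=1}^{n-p}\left|\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\right| = O_p(1)
$$

and, by Lemma 3 on page 55 of [3],

$$
\sup_{\|\mathbf{u}\|\le T,\ t\in\{1,\ldots,n-p\}}\sqrt n\left|\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u})}{n-p+1} - F_n\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right)\right| \xrightarrow{P} 0, \tag{A.11}
$$

(A.8) is $o_p(1)$. Lemma 10 on page 76 of [3] establishes that (A.9) is $o_p(1)$.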
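Finally,

$$
\sup_{\|\mathbf{u}\|\le T}\left| n^{-1/2}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\left[F_z\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{u}}{\sqrt n}\right)\right) - F_z(z_t)\right] - 2|\phi_{0r}|^{-1}\tilde L\,\mathbf{u}'\Gamma_p\mathbf{u}\right|
$$

$$
\le \sup_{\|\mathbf{u}\|\le T}\left|\frac{1}{\sqrt n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F_z(z_t))\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}f_z(z^*_{t,n}(\mathbf{u}))\,(z_t(\boldsymbol{\phi}_0) - z_t)\right|
$$

$$
{}+ \sup_{\|\mathbf{u}\|\le T}\left|\frac{1}{n}\sum_{t=1}^{n-p}\lambda'(F_z(z_t))f_z(z^*_{t,n}(\mathbf{u}))\,\mathbf{u}'\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}'}\mathbf{u} - 2|\phi_{0r}|^{-1}\tilde L\,\mathbf{u}'\Gamma_p\mathbf{u}\right|
$$

$$
{}+ \sup_{\|\mathbf{u}\|\le T}\left|\frac{1}{2n^{3/2}}\sum_{t=1}^{n-p}\lambda'(F_z(z_t))f_z(z^*_{t,n}(\mathbf{u}))\,\mathbf{u}'\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\,\mathbf{u}'\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u}))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u}\right|,
$$

where $f_z$ is the density function for $z_1$, $z^*_{t,n}(\mathbf{u})$ is between $z_t$ and $z_t(\boldsymbol{\phi}_0 +
n^{-1/2}\mathbf{u})$ and $\boldsymbol{\phi}^*_{t,n}(\mathbf{u})$ is between $\boldsymbol{\phi}_0$ and $\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u}$. From (A.3), the first
term on the right-hand side is $o_p(1)$ and, since there exists a geometrically
decaying, nonnegative, real-valued sequence $\{\pi_k\}_{k=-\infty}^{\infty}$ such that

$$
\sup_{\|\mathbf{u}\|\le T}\left|\mathbf{u}'\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u}))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u}\right| \le \sum_{k=-\infty}^{\infty}\pi_k|z_{t-k}| \qquad \forall\, t \in \{1,\ldots,n-p\}
$$

for all $n$ sufficiently large ([7], Section 3.3), the third term is also $o_p(1)$.
Using the uniform continuity of $f_z$, the second term equals

$$
\sup_{\|\mathbf{u}\|\le T}\left|\frac{1}{n}\sum_{t=1}^{n-p}\lambda'(F_z(z_t))f_z(z_t)\left(\mathbf{u}'\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}}\right)^2 - 2\phi_{0r}^{-2}\left(\int_0^1 f_z(F_z^{-1}(s))\lambda'(s)\,ds\right)\mathbf{u}'\Gamma_p\mathbf{u}\right| + o_p(1),
$$

which is $o_p(1)$ by the ergodic theorem. Therefore, (A.10) and, consequently,
(A.6) are $o_p(1)$. Similarly, using the uniform continuity of $\lambda'$, it can be shown
that (A.7) is $o_p(1)$. □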



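Lemma A.4. For any $T \in (0,\infty)$, as $n \to \infty$,

$$
\sup_{\|\mathbf{u}\|,\|\mathbf{v}\|\le T}\left|\frac{1}{n}\sum_{t=1}^{n-p}\mathbf{u}'\left[\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{n-p+1}\right) - \lambda(F_z(z_t))\right]\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u}\right| \xrightarrow{P} 0.
$$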



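Proof. Note that because $F_z$ is a continuous distribution function, it is
also uniformly continuous on $\mathbb{R}$. Using (A.11), the Glivenko-Cantelli theorem
and the uniform continuity of $F_z$, for any $\epsilon, \eta > 0$, it can be shown that
there exists an integer $m$ such that

$$
P\left(\sup_{\|\mathbf{v}\|\le T,\ t\in\{1,\ldots,n-p-m\}}\left|\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{n-p+1} - F_z(z_t)\right| > \eta\right)
$$

$$
\le P\left(\sup_{\|\mathbf{v}\|\le T,\ t\in\{1,\ldots,n-p\}}\left|\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{n-p+1} - F_n\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{v}}{\sqrt n}\right)\right)\right| > \frac{\eta}{3}\right)
+ P\left(\sup_{x\in\mathbb{R}}|F_n(x) - F_z(x)| > \frac{\eta}{3}\right)
$$

$$
{}+ P\left(\sup_{\|\mathbf{v}\|\le T,\ t\in\{1,\ldots,n-p-m\}}\left|F_z\!\left(z_t\!\left(\boldsymbol{\phi}_0 + \frac{\mathbf{v}}{\sqrt n}\right)\right) - F_z(z_t)\right| > \frac{\eta}{3}\right)
$$

is less than $\epsilon$ for all $n$ sufficiently large. Hence,

$$
\sup_{\|\mathbf{u}\|,\|\mathbf{v}\|\le T}\left|\frac{1}{n}\sum_{t=1}^{n-p}\mathbf{u}'\left[\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{n-p+1}\right) - \lambda(F_z(z_t))\right]\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u}\right|
$$

$$
= \sup_{\|\mathbf{u}\|,\|\mathbf{v}\|\le T}\left|\frac{1}{n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda'(F^*_{t,n}(\mathbf{v}))\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\left[\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{n-p+1} - F_z(z_t)\right]\mathbf{u}\right| \xrightarrow{P} 0,
$$

where $F^*_{t,n}(\mathbf{v})$ is between $F_z(z_t)$ and $R_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})/(n-p+1)$, since

$$
\sup_{\|\mathbf{u}\|,\|\mathbf{v}\|\le T}\frac{1}{n}\sum_{t=1}^{n-p}\left|\mathbf{u}'\lambda'(F^*_{t,n}(\mathbf{v}))\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u}\right| = O_p(1).
$$

□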



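For $\mathbf{u} \in \mathbb{R}^p$ and $\delta_1, \delta_2 \in [0,1]$, let

$$
U_n(\mathbf{u},\delta_1,\delta_2) = \sum_{t=1}^{n-p}\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_1\mathbf{u})}{n-p+1}\right)\left[z_t\!\left(\boldsymbol{\phi}_0 + \frac{\delta_2\mathbf{u}}{\sqrt n}\right) - z_t\!\left(\boldsymbol{\phi}_0 + \frac{\delta_1\mathbf{u}}{\sqrt n}\right)\right]
$$

and

$$
V_n(\mathbf{u},\delta_1,\delta_2) = \sum_{t=1}^{n-p}\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_2\mathbf{u})}{n-p+1}\right)\left[z_t\!\left(\boldsymbol{\phi}_0 + \frac{\delta_2\mathbf{u}}{\sqrt n}\right) - z_t\!\left(\boldsymbol{\phi}_0 + \frac{\delta_1\mathbf{u}}{\sqrt n}\right)\right].
$$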

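Using Taylor series expansions,

$$
\begin{aligned}
U_n(\mathbf{u},\delta_1,\delta_2)
={}& \frac{\delta_2-\delta_1}{\sqrt n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_1\mathbf{u})}{n-p+1}\right)\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}} \\
&+ \frac12\,\frac{\delta_2^2-\delta_1^2}{n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_1\mathbf{u})}{n-p+1}\right)\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u} \\
&+ \frac{\delta_2^2}{2n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_1\mathbf{u})}{n-p+1}\right)\left[\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u},\delta_1,\delta_2))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'} - \frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\right]\mathbf{u} \\
&- \frac{\delta_1^2}{2n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_1\mathbf{u})}{n-p+1}\right)\left[\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u},\delta_1,\delta_1))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'} - \frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\right]\mathbf{u}
\end{aligned}
\tag{A.12}
$$

and, similarly,

$$
\begin{aligned}
V_n(\mathbf{u},\delta_1,\delta_2)
={}& \frac{\delta_2-\delta_1}{\sqrt n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_2\mathbf{u})}{n-p+1}\right)\frac{\partial z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}} \\
&+ \frac12\,\frac{\delta_2^2-\delta_1^2}{n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_2\mathbf{u})}{n-p+1}\right)\frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\mathbf{u} \\
&+ \frac{\delta_2^2}{2n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_2\mathbf{u})}{n-p+1}\right)\left[\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u},\delta_2,\delta_2))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'} - \frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\right]\mathbf{u} \\
&- \frac{\delta_1^2}{2n}\sum_{t=1}^{n-p}\mathbf{u}'\lambda\!\left(\frac{R_t(\boldsymbol{\phi}_0 + n^{-1/2}\delta_2\mathbf{u})}{n-p+1}\right)\left[\frac{\partial^2 z_t(\boldsymbol{\phi}^*_{t,n}(\mathbf{u},\delta_2,\delta_1))}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'} - \frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\right]\mathbf{u},
\end{aligned}
\tag{A.13}
$$

where the values of $\boldsymbol{\phi}^*_{t,n}(\mathbf{u},\cdot,\cdot)$ lie between $\boldsymbol{\phi}_0$ and $\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u}$.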
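The quantities $U_n$ and $V_n$ defined above bracket increments of the rank dispersion, which is how they enter (A.14). The following is a minimal numerical sketch of that bracketing for two generic residual vectors. It relies on the rearrangement inequality for a nondecreasing weight function, which we take to be the property behind the cited Theorem 2.1 (an assumption, since that theorem is stated in an earlier section). The Wilcoxon-type score $\lambda(s) = s - 1/2$, its scaling and all function names are illustrative choices.

```python
import numpy as np

def wilcoxon_score(s):
    """Wilcoxon-type score lambda(s) = s - 1/2 (nondecreasing; scaling assumed)."""
    return s - 0.5

def ranks(x):
    """Ranks 1..n of the entries of x (no ties assumed)."""
    order = np.argsort(x)
    r = np.empty(x.size, dtype=int)
    r[order] = np.arange(1, x.size + 1)
    return r

def dispersion(resid, score=wilcoxon_score):
    """Jaeckel-type dispersion: sum_t score(R_t / (n + 1)) * resid_t."""
    n = resid.size
    return float(np.sum(score(ranks(resid) / (n + 1)) * resid))

def lower_upper(resid_a, resid_b, score=wilcoxon_score):
    """Analogues of U_n and V_n: the increment resid_b - resid_a weighted by
    scores of the ranks of resid_a (lower) or of resid_b (upper)."""
    n = resid_a.size
    diff = resid_b - resid_a
    lower = float(np.sum(score(ranks(resid_a) / (n + 1)) * diff))
    upper = float(np.sum(score(ranks(resid_b) / (n + 1)) * diff))
    return lower, upper

# Hypothetical residual vectors standing in for z_t at two parameter values.
rng = np.random.default_rng(1)
a = rng.standard_normal(500)
b = a + 0.3 * rng.standard_normal(500)
lower, upper = lower_upper(a, b)
increment = dispersion(b) - dispersion(a)
```

Here `lower <= increment <= upper` holds for any pair of residual vectors, because pairing the scores with a vector's own ranks maximizes the weighted sum over all rank assignments.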

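Lemma A.5. For $\mathbf{u} \in \mathbb{R}^p$, let $S_n(\mathbf{u}) = D(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u}) - D(\boldsymbol{\phi}_0)$ and
$S(\mathbf{u}) = \mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}(\sigma^2\tilde L - \tilde K)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}$, where $\mathbf{N} \sim N(\mathbf{0}, 2\phi_{0r}^{-2}\{\sigma^2\tilde J - \tilde K^2\}\sigma^{-2}\Gamma_p)$.
Then $S_n(\cdot) \xrightarrow{d} S(\cdot)$ on $C(\mathbb{R}^p)$, the space of continuous functions on $\mathbb{R}^p$ where
convergence is equivalent to uniform convergence on every compact set.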

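Proof. Let $\mathbf{u} \in \mathbb{R}^p$ and suppose that $m$ is any positive integer. Because

$$
D(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u}) - D(\boldsymbol{\phi}_0) = \sum_{k=1}^{m}\left[D\!\left(\boldsymbol{\phi}_0 + \frac{k\mathbf{u}}{m\sqrt n}\right) - D\!\left(\boldsymbol{\phi}_0 + \frac{(k-1)\mathbf{u}}{m\sqrt n}\right)\right],
$$

we have

$$
\sum_{k=1}^{m}U_n\!\left(\mathbf{u},\frac{k-1}{m},\frac{k}{m}\right) \le D(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{u}) - D(\boldsymbol{\phi}_0) \le \sum_{k=1}^{m}V_n\!\left(\mathbf{u},\frac{k-1}{m},\frac{k}{m}\right) \tag{A.14}
$$

by Theorem 2.1. Using (A.12), (A.13) and Lemmas A.1, A.2, A.3 and A.4,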

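$$
\begin{pmatrix}
U_n(\mathbf{u},0,\frac1m)\\[2pt]
U_n(\mathbf{u},\frac1m,\frac2m)\\
\vdots\\
U_n(\mathbf{u},\frac{m-1}{m},1)\\[2pt]
V_n(\mathbf{u},0,\frac1m)\\[2pt]
V_n(\mathbf{u},\frac1m,\frac2m)\\
\vdots\\
V_n(\mathbf{u},\frac{m-1}{m},1)
\end{pmatrix}
\xrightarrow{d}
\begin{pmatrix}
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(-\frac{1}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\[2pt]
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{2}{m^2}\sigma^2\tilde L - \frac{3}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\
\vdots\\
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{2(m-1)}{m^2}\sigma^2\tilde L - \frac{2m-1}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\[2pt]
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{2}{m^2}\sigma^2\tilde L - \frac{1}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\[2pt]
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{4}{m^2}\sigma^2\tilde L - \frac{3}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\
\vdots\\
\frac1m\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{2m}{m^2}\sigma^2\tilde L - \frac{2m-1}{m^2}\tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}
\end{pmatrix}
$$

on $\mathbb{R}^{2m}$, since

$$
\sup_{\|\mathbf{u}\|,\|\mathbf{v}\|\le T}\frac{1}{n}\sum_{t=1}^{n-p}\left|\mathbf{u}'\left[\frac{\partial^2 z_t(\boldsymbol{\phi}_0 + n^{-1/2}\mathbf{v})}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'} - \frac{\partial^2 z_t(\boldsymbol{\phi}_0)}{\partial\boldsymbol{\phi}\,\partial\boldsymbol{\phi}'}\right]\mathbf{u}\right| \xrightarrow{P} 0
$$

for any $T > 0$ and $\sup_{s\in(0,1)}|\lambda(s)| < \infty$ by the uniform continuity of $\lambda'$.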

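Hence,

$$
\begin{pmatrix}
\sum_{k=1}^{m}U_n(\mathbf{u},\frac{k-1}{m},\frac{k}{m})\\[4pt]
\sum_{k=1}^{m}V_n(\mathbf{u},\frac{k-1}{m},\frac{k}{m})
\end{pmatrix}
\xrightarrow{d}
\begin{pmatrix}
\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{m-1}{m}\sigma^2\tilde L - \tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}\\[4pt]
\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{m+1}{m}\sigma^2\tilde L - \tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}
\end{pmatrix}
$$

on $\mathbb{R}^2$. For any $\epsilon > 0$, there exists an integer $m$ sufficiently large so that

$$
\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{m-1}{m}\sigma^2\tilde L - \tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}
\quad\text{and}\quad
\mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}\left(\frac{m+1}{m}\sigma^2\tilde L - \tilde K\right)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}
$$

are both in an $\epsilon$-neighborhood of $S(\mathbf{u}) = \mathbf{u}'\mathbf{N} + |\phi_{0r}|^{-1}(\sigma^2\tilde L - \tilde K)\mathbf{u}'\sigma^{-2}\Gamma_p\mathbf{u}$.
Thus, for any $\mathbf{u} \in \mathbb{R}^p$, $S_n(\mathbf{u}) \xrightarrow{d} S(\mathbf{u})$. It can be shown similarly that all
finite-dimensional distributions of $S_n(\cdot)$ converge to those of $S(\cdot)$.
Also using (A.14), it can be shown that

$$
\lim_{\delta\to0^+}\limsup_{n\to\infty}P\left(\sup_{\mathbf{u},\mathbf{v}\in K,\ \|\mathbf{u}-\mathbf{v}\|\le\delta}|S_n(\mathbf{u}) - S_n(\mathbf{v})| > \eta\right) = 0
$$

for any $\eta > 0$ and any compact subset $K \subset \mathbb{R}^p$ (see [3], pages 84-86). It
follows that $S_n(\cdot)$ must be tight on $C(K)$ and, therefore, because compact
$K \subset \mathbb{R}^p$ is arbitrary, $S_n(\cdot) \xrightarrow{d} S(\cdot)$ on $C(\mathbb{R}^p)$ by Theorem 7.1 in [4]. □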
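The passage from the $m$ individual limits to the two summed limits in the proof above is a small piece of arithmetic. On our reading of the limits (obtained by expanding (A.12) and (A.13) at $\delta_1 = (k-1)/m$, $\delta_2 = k/m$), the coefficient of $\sigma^2\tilde L$ is $2(k-1)/m^2$ for $U_n$ and $2k/m^2$ for $V_n$, and the coefficient of $-\tilde K$ is $(2k-1)/m^2$ for both. The sketch below checks with exact rational arithmetic that these sum to $(m-1)/m$, $(m+1)/m$ and $1$. The function name is hypothetical.

```python
from fractions import Fraction

def summed_coefficients(m):
    """Sum over k = 1..m of the assumed coefficients of sigma^2 * L (for U_n
    and V_n) and of K in the limits of U_n(u,(k-1)/m,k/m), V_n(u,(k-1)/m,k/m)."""
    m2 = Fraction(m * m)
    u_coef_L = sum(Fraction(2 * (k - 1)) / m2 for k in range(1, m + 1))
    v_coef_L = sum(Fraction(2 * k) / m2 for k in range(1, m + 1))
    coef_K = sum(Fraction(2 * k - 1) / m2 for k in range(1, m + 1))
    return u_coef_L, v_coef_L, coef_K

u_coef_L, v_coef_L, coef_K = summed_coefficients(10)
```

As $m$ grows, both $(m-1)/m$ and $(m+1)/m$ approach one, which is why the lower and upper limits can be placed in an arbitrarily small neighborhood of $S(\mathbf{u})$.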

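Lemma A.6. If $\epsilon > 0$ is sufficiently small so that $\boldsymbol{\phi}$ forms a causal polynomial
for all $\boldsymbol{\phi} \in \Phi := \{\boldsymbol{\phi} \in \mathbb{R}^p : \|\boldsymbol{\phi} - \boldsymbol{\phi}_0\| \le \epsilon\}$, then $n^{-1}\sum_{t=1}^{n-p}z_t^2(\boldsymbol{\phi}) \xrightarrow{a.s.}
E\{\tilde z_1^2(\boldsymbol{\phi})\}$ and $n^{-1}D(\boldsymbol{\phi}) \xrightarrow{a.s.} \int_0^1 F^{-1}_{\tilde z(\boldsymbol{\phi})}(s)\lambda(s)\,ds$ uniformly on $\Phi$, where $\tilde z_t(\boldsymbol{\phi}) :=
-\phi^{-1}(B^{-1})\phi(B)X_{t+p}$ and $F_{\tilde z(\boldsymbol{\phi})}(\cdot)$ is the distribution function for $\tilde z_1(\boldsymbol{\phi})$.

Proof. For any $\boldsymbol{\phi} \in \Phi$, $n^{-1}\sum_{t=1}^{n-p}z_t^2(\boldsymbol{\phi}) \xrightarrow{a.s.} E\{\tilde z_1^2(\boldsymbol{\phi})\}$ and $n^{-1}D(\boldsymbol{\phi}) \xrightarrow{a.s.}
E\{\lambda(F_{\tilde z(\boldsymbol{\phi})}(\tilde z_1(\boldsymbol{\phi})))\tilde z_1(\boldsymbol{\phi})\} = \int_0^1 F^{-1}_{\tilde z(\boldsymbol{\phi})}(s)\lambda(s)\,ds$, by the ergodic theorem. Therefore,
since $n^{-1}\sum_{t=1}^{n-p}z_t^2(\cdot)$ and $n^{-1}D(\cdot)$ are equicontinuous and uniformly
bounded on $\Phi$ almost surely (see Lemma 15 on page 86 of [3]; similar results
are obtained in the proof of Proposition 1 in [6]), the lemma follows
by the Arzelà-Ascoli theorem. □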